AITopics | Westport

Collaborating Authors

Westport

Why Are There So Many 'Alternative Devices' All of a Sudden?

The Atlantic - TechnologyMay-17-2025, 11:00:29 GMT

On a recent commute to work, I texted my distant family about our fantasy baseball league, which was nice because I felt connected to them for a second. Then I switched apps and became enraged by a stupid opinion I saw on X, which I shouldn't be using anymore due to its advanced toxicity and mind-numbing inanity. Many minutes passed before I was able to stop reading the stupid replies to the stupid original post and relax the muscles of my face. This is the duality of the phone: It connects me to my loved ones, and sometimes I think it's ruining my life. I need it and I want it, but sometimes I hate it and I fear it.

alternative device, pinwheel, social media, (16 more...)

The Atlantic - Technology

Country:

North America > United States > Utah > Utah County > Lehi (0.05)
North America > United States > New York > Westchester County > Rye (0.05)
North America > United States > Connecticut > Fairfield County > Westport (0.05)

Industry:

Leisure & Entertainment > Games (0.49)
Leisure & Entertainment > Sports (0.34)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.50)
Information Technology > Communications > Mobile (0.49)

Add feedback

Understanding LLMs' Fluid Intelligence Deficiency: An Analysis of the ARC Task

Wu, Junjie, Yu, Mo, Liu, Lemao, Yeung, Dit-Yan, Zhou, Jie

arXiv.org Artificial IntelligenceFeb-10-2025

While LLMs have exhibited strong performance on various NLP tasks, it is noteworthy that most of these tasks rely on utilizing the vast amount of knowledge encoded in LLMs' parameters, rather than solving new problems without prior knowledge. In cognitive research, the latter ability is referred to as fluid intelligence, which is considered to be critical for assessing human intelligence. Recent research on fluid intelligence assessments has highlighted significant deficiencies in LLMs' abilities. In this paper, we analyze the challenges LLMs face in demonstrating fluid intelligence through controlled experiments, using the most representative ARC task as an example. Our study revealed three major limitations in existing LLMs: limited ability for skill composition, unfamiliarity with abstract input formats, and the intrinsic deficiency of left-to-right decoding. Our data and code can be found in https://wujunjie1998.github.io/araoc-benchmark.github.io/.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.0719

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Connecticut > Fairfield County > Westport (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.68)

Industry: Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Zoning in American Cities: Are Reforms Making a Difference? An AI-based Analysis

Salazar-Miranda, Arianna, Talen, Emily

arXiv.org Artificial IntelligenceJan-6-2025

Cities are at the forefront of addressing global sustainability challenges, particularly those exacerbated by climate change. Traditional zoning codes, which often segregate land uses, have been linked to increased vehicular dependence, urban sprawl, and social disconnection, undermining broader social and environmental sustainability objectives. This study investigates the adoption and impact of form-based codes (FBCs), which aim to promote sustainable, compact, and mixed-use urban forms as a solution to these issues. Using Natural Language Processing (NLP) techniques, we analyzed zoning documents from over 2000 U.S. census-designated places to identify linguistic patterns indicative of FBC principles. Our findings reveal widespread adoption of FBCs across the country, with notable variations within regions. FBCs are associated with higher floor-to-area ratios, narrower and more consistent street setbacks, and smaller plots. We also find that places with FBCs have improved walkability, shorter commutes, and a higher share of multi-family housing. Our findings highlight the utility of NLP for evaluating zoning codes and underscore the potential benefits of form-based zoning reforms for enhancing urban sustainability.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.00008

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
(12 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance > Real Estate (1.00)
Law > Real Estate Law (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

GEMS: Generative Expert Metric System through Iterative Prompt Priming

Cheng, Ti-Chung, Badea, Carmen, Bird, Christian, Zimmermann, Thomas, DeLine, Robert, Forsgren, Nicole, Ford, Denae

arXiv.org Artificial IntelligenceOct-1-2024

Across domains, metrics and measurements are fundamental to identifying challenges, informing decisions, and resolving conflicts. Despite the abundance of data available in this information age, not only can it be challenging for a single expert to work across multi-disciplinary data, but non-experts can also find it unintuitive to create effective measures or transform theories into context-specific metrics that are chosen appropriately. This technical report addresses this challenge by examining software communities within large software corporations, where different measures are used as proxies to locate counterparts within the organization to transfer tacit knowledge. We propose a prompt-engineering framework inspired by neural activities, demonstrating that generative models can extract and summarize theories and perform basic reasoning, thereby transforming concepts into context-aware metrics to support software communities given software repository data. While this research zoomed in on software communities, we believe the framework's applicability extends across various fields, showcasing expert-theory-inspired metrics that aid in triaging complex challenges.

gem, metric, pull request, (14 more...)

arXiv.org Artificial Intelligence

2410.0088

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Knowledge Management (1.00)
(5 more...)

Add feedback

Beyond Preferences in AI Alignment

Zhi-Xuan, Tan, Carroll, Micah, Franklin, Matija, Ashton, Hal

arXiv.org Artificial IntelligenceAug-29-2024

The dominant practice of AI alignment assumes (1) that preferences are an adequate representation of human values, (2) that human rationality can be understood in terms of maximizing the satisfaction of preferences, and (3) that AI systems should be aligned with the preferences of one or more humans to ensure that they behave safely and in accordance with our values. Whether implicitly followed or explicitly endorsed, these commitments constitute what we term a preferentist approach to AI alignment. In this paper, we characterize and challenge the preferentist approach, describing conceptual and technical alternatives that are ripe for further research. We first survey the limits of rational choice theory as a descriptive model, explaining how preferences fail to capture the thick semantic content of human values, and how utility representations neglect the possible incommensurability of those values. We then critique the normativity of expected utility theory (EUT) for humans and AI, drawing upon arguments showing how rational agents need not comply with EUT, while highlighting how EUT is silent on which preferences are normatively acceptable. Finally, we argue that these limitations motivate a reframing of the targets of AI alignment: Instead of alignment with the preferences of a human user, developer, or humanity-writ-large, AI systems should be aligned with normative standards appropriate to their social roles, such as the role of a general-purpose assistant. Furthermore, these standards should be negotiated and agreed upon by all relevant stakeholders. On this alternative conception of alignment, a multiplicity of AI systems will be able to serve diverse ends, aligned with normative standards that promote mutual benefit and limit harm despite our plural and divergent values.

ai system, alignment, utility function, (13 more...)

arXiv.org Artificial Intelligence

2408.16984

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(14 more...)

Genre:

Research Report (0.63)
Overview (0.45)

Industry:

Law (1.00)
Government (1.00)
Health & Medicine (0.92)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(7 more...)

Add feedback

A Network Analysis Approach to Conlang Research Literature

Gonzalez, Simon

arXiv.org Artificial IntelligenceJul-22-2024

The field of conlang has evidenced an important growth in the last decades. This has been the product of a wide interest in the use and study of conlangs for artistic purposes. However, one important question is what it is happening with conlang in the academic world. This paper aims to have an overall understanding of the literature on conlang research. With this we aim to give a realistic picture of the field in present days. We have implemented a computational linguistic approach, combining bibliometrics and network analysis to examine all publications available in the Scopus database. Analysing over 2300 academic publications since 1927 until 2022, we have found that Esperanto is by far the most documented conlang. Three main authors have contributed to this: Garv\'ia R., Fiedler S., and Blanke D. The 1970s and 1980s have been the decades where the foundations of current research have been built. In terms of methodologies, language learning and experimental linguistics are the ones contributing to most to the preferred approaches of study in the field. We present the results and discuss our limitations and future work.

conlang, linguistics, publication, (13 more...)

arXiv.org Artificial Intelligence

2407.1537

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Oceania > Australia (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay

de Carvalho, Gonçalo Hora, Pollice, Robert, Knap, Oscar

arXiv.org Artificial IntelligenceJul-17-2024

We explore the hypothesis that LLMs, such as GPT-3.5 and GPT-4, possess broader cognitive functions, particularly in non-linguistic domains. Our approach extends beyond standard linguistic benchmarks by incorporating games like Tic-Tac-Toe, Connect Four, and Battleship, encoded via ASCII, to assess strategic thinking and decision-making. To evaluate the models' ability to generalize beyond their training data, we introduce two additional games. The first game, LEGO Connect Language (LCL), tests the models' capacity to understand spatial logic and follow assembly instructions. The second game, the game of shapes, challenges the models to identify shapes represented by 1s within a matrix of zeros, further testing their spatial reasoning skills. This "show, don't tell" strategy uses games instead of simply querying the models. Our results show that despite their proficiency on standard benchmarks, GPT-3.5 and GPT-4's abilities to play and reason about fully observable games without pre-training is mediocre. Both models fail to anticipate losing moves in Tic-Tac-Toe and Connect Four, and they are unable to play Battleship correctly. While GPT-4 shows some success in the game of shapes, both models fail at the assembly tasks presented in the LCL game. These results suggest that while GPT models can emulate conversational proficiency and basic rule comprehension, their performance in strategic gameplay and spatial reasoning tasks is very limited. Importantly, this reveals a blind spot in current LLM benchmarks that we highlight with our gameplay benchmark suite ChildPlay (https://github.com/child-play-neurips/child-play). Our findings provide a cautionary tale about claims of emergent intelligence and reasoning capabilities of LLMs that are roughly the size of GPT-3.5 and GPT-4.

gpt-3, iteration, quantity, (17 more...)

arXiv.org Artificial Intelligence

2407.11068

Country:

North America > United States > Connecticut > Fairfield County > Westport (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > The Hague (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Tic-Tac-Toe (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

How Well Do LLMs Represent Values Across Cultures? Empirical Analysis of LLM Responses Based on Hofstede Cultural Dimensions

Kharchenko, Julia, Roosta, Tanya, Chadha, Aman, Shah, Chirag

arXiv.org Artificial IntelligenceJun-20-2024

Large Language Models (LLMs) attempt to imitate human behavior by responding to humans in a way that pleases them, including by adhering to their values. However, humans come from diverse cultures with different values. It is critical to understand whether LLMs showcase different values to the user based on the stereotypical values of a user's known country. We prompt different LLMs with a series of advice requests based on 5 Hofstede Cultural Dimensions -- a quantifiable way of representing the values of a country. Throughout each prompt, we incorporate personas representing 36 different countries and, separately, languages predominantly tied to each country to analyze the consistency in the LLMs' cultural understanding. Through our analysis of the responses, we found that LLMs can differentiate between one side of a value and another, as well as understand that countries have differing values, but will not always uphold the values when giving advice, and fail to understand the need to answer differently based on different cultural values. Rooted in these findings, we present recommendations for training value-aligned and culturally sensitive LLMs. More importantly, the methodology and the framework developed here can help further understand and mitigate culture and language alignment issues with LLMs.

cultural dimension, llm, persona, (13 more...)

arXiv.org Artificial Intelligence

2406.14805

Country:

Asia > Japan (0.05)
Europe > Russia (0.04)
Asia > Russia (0.04)
(42 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Iterative Utility Judgment Framework via LLMs Inspired by Relevance in Philosophy

Zhang, Hengran, Bi, Keping, Guo, Jiafeng, Cheng, Xueqi

arXiv.org Artificial IntelligenceJun-17-2024

Utility and topical relevance are critical measures in information retrieval (IR), reflecting system and user perspectives, respectively. While topical relevance has long been emphasized, utility is a higher standard of relevance and is more useful for facilitating downstream tasks, e.g., in Retrieval-Augmented Generation (RAG). When we incorporate utility judgments into RAG, we realize that the topical relevance, utility, and answering in RAG are closely related to the three types of relevance that Schutz discussed from a philosophical perspective. They are topical relevance, interpretational relevance, and motivational relevance, respectively. Inspired by the dynamic iterations of the three types of relevance, we propose an Iterative utiliTy judgmEnt fraMework (ITEM) to promote each step of the cycle of RAG. We conducted extensive experiments on multi-grade passage retrieval and factoid question-answering datasets (i.e., TREC DL, WebAP, and NQ). Experimental results demonstrate significant improvements in utility judgments, ranking of topical relevance, and answer generation upon representative baselines, including multiple single-shot utility judging approaches. Our code and benchmark can be found at https://anonymous.4open.science/r/ITEM-B486/.

dataset, relevance, utility judgment, (16 more...)

arXiv.org Artificial Intelligence

2406.1129

Country:

Asia > Singapore (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Television (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

What is the Visual Cognition Gap between Humans and Multimodal LLMs?

Cao, Xu, Lai, Bolin, Ye, Wenqian, Ma, Yunsheng, Heintz, Joerg, Chen, Jintai, Cao, Jianguo, Rehg, James M.

arXiv.org Artificial IntelligenceJun-14-2024

Recently, Multimodal Large Language Models (MLLMs) have shown great promise in language-guided perceptual tasks such as recognition, segmentation, and object detection. However, their effectiveness in addressing visual cognition problems that require high-level reasoning is not well-established. One such challenge is abstract visual reasoning (AVR) - the cognitive ability to discern relationships among patterns in a set of images and extrapolate to predict subsequent patterns. This skill is crucial during the early neurodevelopmental stages of children. Inspired by the AVR tasks in Raven's Progressive Matrices (RPM) and Wechsler Intelligence Scale for Children (WISC), we propose a new dataset MaRs-VQA and a new benchmark VCog-Bench containing three datasets to evaluate the zero-shot AVR capability of MLLMs and compare their performance with existing human intelligent investigation. Our comparative experiments with different open-source and closed-source MLLMs on the VCog-Bench revealed a gap between MLLMs and human intelligence, highlighting the visual cognitive limitations of current MLLMs. We believe that the public release of VCog-Bench, consisting of MaRs-VQA, and the inference pipeline will drive progress toward the next generation of MLLMs with human-like visual cognition abilities. The code and datasets for our benchmark are at GitHub.com/IrohXu/VCog-Bench

arxiv preprint arxiv, mllm, reasoning, (14 more...)

arXiv.org Artificial Intelligence

2406.10424

Country:

North America > United States > Virginia (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback